NASA Contractor Report 181860 
ICASE Report No. 89-37 


ICASE 


BUNCH-KAUFMAN FACTORIZATION FOR REAL 
SYMMETRIC INDEFINITE BANDED MATRICES 


Mark T. Jones 
Merrell L. Patrick 

(NASA-CB-181860) BUNCH-KAUFMAN R89-28341 

FACTORIZATION FOR REAL SYMMETRIC INDEFINITE 
BANDED MATRICES Final Report (ICASE) 15 p 

CSCL 1 2 A Unclas 

63/64 0224027 


Contract No. NAS 1-1 8605 
May 1989 

Institute for Computer Applications in Science and Engineering 
NASA Langley Research Center 
Hampton, Virginia 23665-5225 

Operated by the Universities Space Research Association 


NASA 

National Aeronautics and 
Space Administration 

Langley Research Center 

Hampton, Virginia 23665-5225 



Recently, ICASE has begun differentiating between reports with a mathemat- 
ical or applied science theme and reports whose main emphasis is some aspect of 
computer science by producing the computer science reports with a yellow cover. 
The blue cover reports will now emphasize mathematical research. In all other 
aspects the reports will remain the same; in particular, they will continue to be 
submitted to the appropriate journals or conferences for formal publication. 



Bunch-Kaufman Factorization for Real 
Symmetric Indefinite Banded Matrices 

Mark T. Jones*and Merrell L. Patrick*^ 


Abstract 

The Bunch-Kaufman algorithm for factoring symmetric indefinite 
matrices has been rejected for banded matrices because it destroys the 
banded structure of the matrix. Herein, it is shown that for a sub- 
class of real symmetric matrices which arise in solving the generalized 
eigenvalue problem using Lanczos’s method, the Bunch-Kaufman al- 
gorithm does not result in major destruction of the bandwidth. Space 
time complexities of the algorithm are given and used to show that 
the Bunch-Kaufman algorithm is a significant improvement over LU 
factorization. 
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1 Introduction 

The Bunch-Kaufman algorithm is considered one of the best methods for 
factoring full, symmetric, indefinite matrices [BK77], [BG76]. It has also 
been modified and successfully used to factor sparse matrices [DRMN79]. 
However, to date it has been rejected for banded, symmetric indefinite ma- 
trices because it destroys the banded structure of the matrix [BK77]. Herein 
it is shown that for a subclass of real symmetric indefinite matrices, which 
arise in solving the generalized eigenvalue problem using Lanczos’s method, 
the Bunch-Kaufman algorithm does not result in major destruction of the 
bandwidth. F urthermore, for our class of problems, the Bunch-Kaufman 
factorization algorithm is a significant improvement over LU factorization, 
the standard of comparison for such methods [BK77]. In addition to taking 
advantage of symmetry, the Bunch-Kaufman algorithm yields the inertia of 
the matrix essentially for free [BK77], which is important in eigenvalue cal- 
culations. LU factorization does not yield the inertia as a by-product and 
destroys the symmetry of the matrix, thus increasing storage requirements 
for its implementation. 

In section 2 we give one of the several variations of the Bunch-Kaufman 
algorithm and in section 3 describe a subclass of matrices to which we apply 
it. An efficient implementation of the method is described in section 4 and 
the space/time complexity of the implementation is disussed in section 5. 
Conclusions are drawn in section 6. 


2 The Bunch-Kaufman Algorithm 

The Bunch-Kaufman algorithm factors A, an nxn real symmetric indefinite 
matrix, into LDL T while doing symmetric permutations on A to maintain 
stability, resulting in the following equation: 

PAP t = LDL t . (1) 

Although several variations of the algorithm exist, algorithm D of the 
Bunch-Kaufman paper is the least destructive of the banded structure 
[BK77]. The algorithm is shown in figure 1. 
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1) for t = 1, n 

begin 

2) if the previous step was not a 2x2 pivot then 
begin 

3) A = max J=J+l n | a jti \ 

4) set r to the row number of this value 

5) if A a < | a, ,• | then 
begin 

6) perform a lxl pivot 
end 

else 

begin 

7) o max ;=l+ x,n | ®r,y | 

8) if a A 2 < o | a, jt | then 
begin 

9) perform a lxl pivot 
end 

else 

begin 

10) exchange rows and columns r and t + 1 

11) perform a 2x2 pivot 
end 

end 

end 

12) end 

13) if inertia is desired, then scan the D matrix 

Figure 1: The Bunch-Kaufman Factorization Algorithm 
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The parameter, a, is chosen such that stability is maximized and has 
been shown by Bunch and Kaufman to be approximately 0.525 [BK77]. 
The exchange of rows and columns in step 10 maintains the symmetry of 
the matrix, unlike LU factorization which destroys the symmetry of the 
matrix by permuting only rows. 


3 Applicable Set of Matrices 

Bunch and Kaufman show that, in general, if m is the semi-bandwidth of 
a matrix being factored, then a 2x2 pivot can increase the semi-bandwidth 
from m to (2m) — 2 and that this can happen at every step thus resulting 
in the complete destruction of the band structure due to fill-in outside the 
band [BK77]. However, it will be shown in section four that for a subclass 
of matrices this fill-in can be controlled. The number of 2x2 pivots is 
bounded above by the number of negative eigenvalues of A, because each 
2x2 pivot represents a positive-negative eigenvalue pair [BK77]. Also, the 
increase of the semi-bandwidth from m to (2m) — 2 is a worst case that 
in practice is not likely to occur. Therefore, for matrices with a small 
number of negative eigenvalues (in relation to the size of the matrix), it is 
possible to use Bunch-Kaufman factorization with very little fill-in. Such 
matrices arise in eigenvalue calculations where the smallest eigenvalues are 
sought. Methods such as inverse iteration and Lanczos’s method are often 
used to find the smallest eigenvalues of a matrix, A. To do so, they often 
require the factorization of a matrix, (A — <rJ), where a is normally very 
near the left end of A’s spectrum, but may not be to the left of the smallest 
eigenvalue, thus the matrix is indefinite [NOPT83] but has only a small 
number of negative eigenvalues. These matrices can be banded, as they 
are in structural mechanics [BH87]. The difficulty is that the location and 
amount of the fill-in outside the band is not possible to predict a priori. 
In the following section, a detailed algorithm which dynamically allows for 
fill-in during factorization will be presented. 


4 Efficient Implementation of the Algorithm 
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Figure 2: Example of Fill-in (Note: this is an example of worst case fill-in) 

As the following algorithm is executed the original matrix is copied, 
piece-wise, from one place in memory to another. This allows for dynamic 
allocation of fill-in as well as only requiring part of the matrix to be in main 
memory at any particular time. Fill-in only takes place in a small triangle 
when a 2x2 pivot occurs. If a pivot occurs at step k, this triangle is of 
the form shown in figure 2, where •’s represent eliminated elements in L, 
x’s are uneliminated non-zeros, 0’s are zeros outside the band for which no 
storage is needed, and / s are areas where fill occurs. The triangle of fill is 
from row r + 1 to row r + m, where m is the semi-bandwidth (this area may 
already contain non-zeros depending on the value of r, so no extra memory 
may be needed). The algorithm is as follows: 

0) set upto to 0 

1) for i = 1, n 
begin 

2) if the previous step was not a 2x2 pivot then 
begin 

3) read rows upto to min(n,i + m) of the matrix A into L, 
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no extra space for fill needs to be added for these rows 

4) 

set upto to min(n,t + m) 

5) 

A = max; =»-(- 1 ,upto | 1 

6) 

set r to the row number of this value 

7) 

if A a <| a,,, | then 


begin 

8) 

go to 11 


end 

9) 

o — max J=1+ 1 t upto | ®r,j | 

C 

(it may be necessary to access some elements that are not 

C 

read in at this point, but the number of elements is small, 

C 

so they may be read into L or simply discarded, 

C 

this is only a concern if i/o is taking place) 

10) 

if a A 2 < o | a.,i | then 


begin 

c 

perform a lxl pivot 

11) 

set pi — i 

12) 

set di i = a,i i 

13) 

set d, , + 1 = 0.0 

14) 

set aij = 1.0 

15) 

for j = t + 1, upto 


begin 

16) 

Vj = CL j t i 

17) 

J? 

■«% 

II 

■«^v 

18) 

a j,i = vl i 


end 

19) 

for j = t + 1, upto 


begin 

20) 

for k = i + 1, J 


begin 
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21) 


a j,k = a i,k - v h V k 
end 

end 
end 
else 
begin 

permute the matrix and then perform a 2x2 pivot 

22) read rows upto to min(n,r + m) of the matrix A into L, 
and allocate space for the fill triangle 

23) set upto to min(n,r + m) 

24) exchange rows and columns r and t + 1 

25) set pi = i 

26) set p l+1 = r 

27) set di i = a, ,- 

28) set d,+i ,»•+! = a, +1 f+1 

29) set dii+i — 

30) set di+i'i+t = 0.0 

31) set determinant = (((d,,,d I+1> , +1 )/d,- 1+1 ) - d iji+1 )d i>i+1 

32) for j = t ’ + 2, upto 
begin 

33) Vj = a Jt 

34) v2j = a jii+1 

v h = a j,idi+ i,i+i — aj,«+id»,,+x 
^®) vl2j 

37 ) o-j.i = v/y 

38) a ;i+1 = v/2j 
end 

39) for j = * + 2, upfo 
begin 

40) for k = * + 2, j 
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41) 


begin 

a i,k = O-j.k - {vljVk + vl2jV2t) 
end 
end 
end 
end 
end 

p is a vector representing the permutation matrices. The only time fill 
outside the band occurs is in step 24 of the algorithm when a 2x2 pivot 
occurs and then storage for the fill is allocated dynamically. 


5 Speed and Storage Analysis 

In this section we compare the space/time requirements of our implementa- 
tion of the Bunch-Kaufman algorithm with LU factorization. The storage 
requirements for both algorithms will be analyzed for two different situa- 
tions: 1) when simply factoring a matrix that falls in the subclass described 
in section 2, and 2) when factoring a matrix pencil such as (K-aM) where 
K and M are symmetric, K is positive definite and a is near the left end 
of K ' s spectrum. 

In the first situation, the storage required by the algorithm presented 
in section 4 is significantly less than that required by LU factorization for 
the set of matrices that was described in section 2. The storage required 
by LU factorization is approximately 3 mn [BK77], The storage needed by 
this implementation of Bunch-Kaufman is mn for the original storage from 
which the matrix is copied, plus mn for the locations to which the matrix 
is copied, plus an additional amount C which is the amount of storage 
necessary for the fill-in triangles. C is much less than mn, because of the 
small number of negative eigenvalues. In addition, two vectors of length n 
are needed for storing the D matrix giving a total of 2n.(m+l) +C. So when 
C is small, approximately (m - 2)n storage locations are saved factoring 
matrices using the Bunch-Kaufman algorithm instead of LU factorization. 

In the second situation (which arises in an efficient implementation of 
Lanczos’s method for solving Kx = XMx'j , the shift o may change during 
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Method 

adds. 

mults. 

divisions 

sq. roots 

comps. 

fill 

Choi. 

44433080 

48140336 

1824 

1824 

0 

0 

B-K 

48277445 

48686784 

1831 

0 

446326 

2083 

LU 

137241687 

137648943 

1823 

0 

409079 

' 2mn 


Figure 3: Operation Counts for Factorization: n=1824, m=240, 5 negative 
eigenvalues 
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0 

B-K 

52023663 

52445756 

1837 

0 

485452 

14837 

LU 

137412094 

137819350 

1823 

0 

409079 
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Figure 4: Operation Counts for Factorization: n=1824, m=240, 19 negative 
eigenvalues 


execution of the algorithm, so K and AI must be saved throughout the com- 
putation. In this situation, the storage requirements for LU factorization 
is increased to (4mn), but the storage needed by Bunch-Kaufman remains 

the same, namely 2n(m + 1) + C making it even more attractive in this 
case. 

The operation counts for factorization are the same in both cases. The 
operation count for Bunch-Kaufman is significantly less than that of LU 
factorization because symmetry is exploited and the fill-in is limited. For 
simplicity, the operations added by the fill-in during Bunch-Kaufman are 
ignored, since the amount that is added is trivial. The high order term 
in the operation counts for Bunch-Kaufman is approximately nm ! arith- 
metic operations plus approximately nm comparisons while the high order 
term for the operation counts for LU is approximately 4nm 2 arithmetic 
operations plus approximately nm comparisons. 

The Bunch-Kaufman method also vectorizes well if the semi-bandwidth 
is large enough. The gains from vectorization are much the same as those 
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Figure 5: Operation Counts for Factorization: n=1980, m— 59, 5 negative 
eigenvalues 
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Figure 6 : Operation Counts for Factorization: n=1980, m— 59, 15 negative 
eigenvalues 
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19 

7 
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59 
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3 
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59 

. - 5~ 
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Figure 7: The number of 2x2 pivots for each problem 
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for Choleski factorization. 

The operations counts for both types of factorization, as well as Choleski 
factorization, when using Lanczos’s method for solving the generalized 
eigenvalue problem are given in figures 3, 4, 5, and 6. The fill-in dur- 
ing factorization is also shown in these figures. The amount of fill-in when 
using Bunch-Kaufman can be seen to increase when the number of nega- 
tive eigenvalues increases. The implementation of LU factorization that is 
used for the comparison is sgbfa from the Linpack package [DBMS78]. The 
measurements for Choleski factorization are given only as a reference point, 
the matrices that were solved were shifted to make them positive definite 
for the Choleski factorization runs, otherwise Choleski factorization would 
have failed due to the indefiniteness of the system. These matrices arise 
from a problem in a structural engineering application [BH87]. In figure 7 
the number of 2x2 pivots that occurred in each problem can be examined. 

The solution phase that occurs after factorization takes slightly longer 
for Bunch-Kaufman than for LU factorization due to the fact that three 
matrices, L, D, and L ( , arise from Bunch-Kaufman (see equation 1) rather 
than just two matrices, L and U, that arise from LU factorization. This 
solution phase however takes much less time than factorization, so this is 
not significant. 

6 Conclusions 

The Bunch-Kaufman method has been shown to be a more efficient fac- 
torization method than LU factorization in terms of time and storage for 
banded real symmetric indefinite matrices with a small number of eigenval- 
ues. An algorithm has been presented that greatly limits the fill needed for 
factorization as well as taking advantage of the symmetry of the matrix. 
This method has been shown to be nearly as stable as LU factorization by 
Bunch [BK77]. 
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